AITopics | labeled data

labeled data

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Star Temporal Classification: Sequence Modeling with Partially Labeled Data

Neural Information Processing Systems | Dec-24-2025, 06:06:13 GMT

We develop an algorithm that can learn from partially labeled and unsegmented sequential data. Most sequential loss functions, such as Connectionist Temporal Classification (CTC), break down when many labels are missing. We address this problem with Star Temporal Classification (STC), which uses a special star token to allow alignments that include all possible tokens whenever a token could be missing. We express STC as the composition of weighted finite-state transducers (WFSTs) and use GTN (a framework for automatic differentiation with WFSTs) to compute gradients. We perform extensive experiments on automatic speech recognition. These experiments show that STC can close the performance gap with a supervised baseline to about 1% WER when up to 70% of the labels are missing. We also perform experiments in handwriting recognition to show that our method easily applies to other temporal classification tasks.
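
To make the role of the star token concrete, the sketch below checks which complete transcripts a partial label admits once a star (matching zero or more arbitrary tokens) is placed before, between, and after the observed tokens. It only illustrates the set of admitted label sequences, under the assumption that a star goes in every gap. It is not the authors' WFST construction and does not use GTN. The names `STAR`, `with_stars`, and `admits` are hypothetical.

```python
STAR = "<star>"  # hypothetical symbol standing for "zero or more arbitrary tokens"


def with_stars(partial):
    """Interleave the observed tokens with star tokens: * t1 * t2 * ... tn *."""
    out = [STAR]
    for token in partial:
        out.extend([token, STAR])
    return out


def admits(pattern, transcript):
    """Return True if `transcript` matches `pattern`, where STAR matches any run of tokens."""
    # reachable[j] is True when the pattern prefix processed so far can consume transcript[:j].
    reachable = [True] + [False] * len(transcript)
    for symbol in pattern:
        nxt = [False] * (len(transcript) + 1)
        if symbol == STAR:
            # A star may consume zero or more tokens, so reachability spreads to the right.
            seen = False
            for j in range(len(transcript) + 1):
                seen = seen or reachable[j]
                nxt[j] = seen
        else:
            # A regular token must consume exactly one matching transcript token.
            for j in range(len(transcript)):
                if reachable[j] and transcript[j] == symbol:
                    nxt[j + 1] = True
        reachable = nxt
    return reachable[-1]


pattern = with_stars(["the", "sat"])
print(admits(pattern, ["the", "cat", "sat", "down"]))  # tokens missing in the gaps
print(admits(pattern, ["sat", "the"]))                 # observed tokens in the wrong order
```

In this toy view, a partial label admits exactly those transcripts that contain it as an ordered subsequence. A training loss built on this idea would sum over alignments of all admitted transcripts rather than test a single one.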

name change, sequence modeling, star temporal classification, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.61)
Information Technology > Artificial Intelligence > Machine Learning (0.59)

Star Temporal Classification: Sequence Modeling with Partially Labeled Data

Neural Information Processing Systems | Oct-11-2024, 03:22:48 GMT

sequence modeling, star temporal classification, temporal classification, (5 more...)

Genre: Play > Prospect (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.76)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.66)

Learning with Explanation Constraints

Pukdee, Rattana; Sam, Dylan; Kolter, J. Zico; Balcan, Maria-Florina; Ravikumar, Pradeep

arXiv.org Machine Learning | Dec-22-2023

As larger deep learning models are hard to interpret, there has been a recent focus on generating explanations of these black-box models. In contrast, we may have a priori explanations of how models should behave. In this paper, we formalize this notion as learning from explanation constraints and provide a learning-theoretic framework to analyze how such explanations can improve the learning of our models. One may naturally ask, "When would these explanations be helpful?" Our first key contribution addresses this question via a class of models that satisfies these explanation constraints in expectation over new data. We provide a characterization of the benefits of these models (in terms of the reduction of their Rademacher complexities) for a canonical class of explanations given by gradient information in the settings of both linear models and two-layer neural networks. In addition, we provide an algorithmic solution for our framework, via a variational approximation that achieves better performance and satisfies these constraints more frequently than simpler augmented Lagrangian methods for incorporating these explanations. We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.

artificial intelligence, constraint, machine learning, (17 more...)

arXiv: 2303.14496

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Conformal Prediction with Partially Labeled Data

Javanmardi, Alireza; Sale, Yusuf; Hofman, Paul; Hüllermeier, Eyke

arXiv.org Artificial Intelligence | Jun-1-2023

While the predictions produced by conformal prediction are set-valued, the data used for training and calibration is supposed to be precise. In the setting of superset learning or learning from partial labels, a variant of weakly supervised learning, it is exactly the other way around: training data is possibly imprecise (set-valued), but the model induced from this data yields precise predictions. In this paper, we combine the two settings by making conformal prediction amenable to set-valued training data. We propose a generalization of the conformal prediction procedure that can be applied to set-valued training and calibration data. We prove the validity of the proposed method and present experimental studies in which it compares favorably to natural baselines.
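
As a rough illustration of how calibration could work with set-valued labels, the sketch below runs a standard split-conformal procedure in which each calibration example carries a set of candidate labels. The aggregation used here, taking the worst (largest) nonconformity over the candidates, is an assumption chosen because it is conservative. It is not claimed to be the score the authors propose. The function names and the toy data are hypothetical.

```python
import math


def calibration_score(probs, candidates):
    """Nonconformity of a set-valued label: worst case (max) of 1 - p(y) over its candidates.

    Assumption for this sketch: max aggregation is one conservative choice,
    not necessarily the one used in the paper.
    """
    return max(1.0 - probs[y] for y in candidates)


def conformal_threshold(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points: every label is kept
    return sorted(scores)[rank - 1]


def prediction_set(probs, threshold):
    """All labels whose nonconformity 1 - p(y) does not exceed the threshold."""
    return {y for y, p in probs.items() if 1.0 - p <= threshold}


# Toy calibration data: (predicted class probabilities, set of candidate labels).
calibration = [
    ({"a": 0.7, "b": 0.2, "c": 0.1}, {"a"}),
    ({"a": 0.5, "b": 0.4, "c": 0.1}, {"a", "b"}),
    ({"a": 0.1, "b": 0.8, "c": 0.1}, {"b"}),
    ({"a": 0.3, "b": 0.3, "c": 0.4}, {"c"}),
]
scores = [calibration_score(p, s) for p, s in calibration]
threshold = conformal_threshold(scores, alpha=0.25)
print(prediction_set({"a": 0.6, "b": 0.3, "c": 0.1}, threshold))
```

Because the max over a candidate set is never smaller than the score of the true label it contains, the threshold can only grow relative to the precisely labeled case. The resulting sets are therefore at least as large, which is the price paid for imprecise calibration data in this sketch.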

artificial intelligence, machine learning, prediction, (15 more...)

arXiv: 2306.01191

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)

Automatic Generation of Labeled Data for Video-Based Human Pose Analysis via NLP applied to YouTube Subtitles

Dill, Sebastian; Zhihan, Susi; Rohr, Maurice; Sharbafi, Maziar; Antink, Christoph Hoog

arXiv.org Artificial Intelligence | May-2-2023

With recent advancements in computer vision as well as machine learning (ML), video-based at-home exercise evaluation systems have become a popular topic of current research. However, performance depends heavily on the amount of available training data. Since labeled datasets specific to exercising are rare, we propose a method that makes use of the abundance of fitness videos available online. Specifically, we exploit the fact that videos often not only show the exercises, but also provide language as an additional source of information. With push-ups as an example, we show that through the analysis of subtitle data using natural language processing (NLP), it is possible to create a labeled (irrelevant, relevant correct, relevant incorrect) dataset containing relevant information for pose analysis. In particular, we show that irrelevant clips ($n=332$) have significantly different joint visibility values compared to relevant clips ($n=298$). Inspecting cluster centroids also shows different poses for the different classes.

artificial intelligence, machine learning, video, (16 more...)

arXiv: 2304.14489

Country: Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)

Genre: Research Report (0.83)

Industry: Health & Medicine > Consumer Health (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Active Learning: Learning with Limited Labeled Data in Python (Scikit-learn, Active Learning Lib) - Code Armada, LLC

#artificialintelligence | Apr-11-2023, 11:35:28 GMT

Active Learning is a machine learning approach that enables the selection of the most informative data points to be labeled by an oracle, thereby reducing the number of labeled data points required to train a model. Active Learning is useful in scenarios where labeled data is limited or expensive to acquire. Active Learning can help improve the accuracy of machine learning models with fewer labeled data points.

Learning with Limited Labeled Data in Python

Python is a popular language for machine learning, and several libraries support Active Learning. In this tutorial, we will use the Scikit-learn library to train a model and the Active Learning library to select informative data points to be labeled.

Import Libraries

We will start by importing the necessary libraries, including Scikit-learn for training the model, NumPy for numerical computations, and the Active Learning library for selecting informative data points to be labeled.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from modAL.uncertainty import uncertainty_sampling

Generate Data

Next, we will generate some random data for training and testing the model.

    # Generate random data for […]
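
The selection step that the tutorial delegates to an uncertainty-sampling routine can be sketched without any third-party library. The example below is a minimal least-confidence query strategy, assuming the model has already produced class probabilities for each unlabeled pool point. The function name `least_confident` and the probability values are hypothetical, and this is not the tutorial's own code.

```python
def least_confident(pool_probs, n_instances=1):
    """Return indices of the pool items whose top predicted probability is lowest."""
    # Uncertainty of a point is one minus the probability of its most likely class.
    uncertainty = [1.0 - max(p) for p in pool_probs]
    ranked = sorted(range(len(pool_probs)), key=lambda i: uncertainty[i], reverse=True)
    return ranked[:n_instances]


# Hypothetical class-probability predictions for four unlabeled pool points.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.51, 0.49],
]
query = least_confident(pool_probs, n_instances=2)
print(query)  # indices to send to the oracle for labeling
```

In a full loop, the queried points would be labeled by the oracle, moved from the pool into the training set, and the model refit before the next query.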

active learning, artificial intelligence, machine learning, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.52)

Information Regularization with Partially Labeled Data

Neural Information Processing Systems | Apr-6-2023, 16:17:25 GMT

Classification with partially labeled data requires using a large number of unlabeled examples (or an estimated marginal P(x)), to further constrain the conditional P(y|x) beyond a few available labeled examples. We formulate a regularization approach to linking the marginal and the conditional in a general way. The regularization penalty measures the information that is implied about the labels over covering regions. No parametric assumptions are required and the approach remains tractable even for continuous marginal densities P(x). We develop algorithms for solving the regularization problem for finite covers, establish a limiting differential equation, and exemplify the behavior of the new regularization approach in simple cases.

information regularization, labeled data

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning Hybrid Models for Image Annotation with Partially Labeled Data

Neural Information Processing Systems | Apr-6-2023, 14:16:54 GMT

Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain. We explore a hybrid model framework for utilizing partially labeled data that integrates a generative topic model for image appearance with discriminative label prediction. We propose three alternative formulations for imposing a spatial smoothness prior on the image labels. Tests of the new models and some baseline approaches on two real image datasets demonstrate the effectiveness of incorporating the latent structure.

image annotation, labeled data, learning hybrid model

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Dynamic Flows on Curved Space Generated by Labeled Data

Hua, Xinru; Nguyen, Truyen; Le, Tam; Blanchet, Jose; Nguyen, Viet Anh

arXiv.org Artificial Intelligence | Jan-31-2023

The scarcity of labeled data is a long-standing challenge for many machine learning tasks. We propose a gradient flow method that leverages an existing dataset (i.e., source) to generate new samples that are close to the dataset of interest (i.e., target). We lift both datasets to the space of probability distributions on the feature-Gaussian manifold, and then develop a gradient flow method that minimizes the maximum mean discrepancy loss. To perform the gradient flow of distributions on the curved feature-Gaussian space, we unravel the Riemannian structure of the space and compute explicitly the Riemannian gradient of the loss function induced by the optimal transport metric. For practical applications, we also propose a discretized flow, and provide conditional results guaranteeing the global convergence of the flow to the optimum. We illustrate the results of our proposed gradient flow method on several real-world datasets and show our method can improve the accuracy of classification models in transfer learning settings.
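
For readers unfamiliar with the loss being minimized, the sketch below computes a biased estimate of the squared maximum mean discrepancy (MMD) between two small point sets using a Gaussian kernel in ordinary Euclidean space. This is a simplification stated as an assumption: the paper works on a curved feature-Gaussian manifold with a Riemannian gradient, which this example does not attempt to reproduce. The function names, the bandwidth, and the sample points are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared maximum mean discrepancy."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
near = [(0.1, 0.0), (1.0, 0.1), (0.0, 0.9)]
far = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
print(mmd_squared(source, near) < mmd_squared(source, far))
```

A flow that moves source samples so as to decrease this quantity pulls them toward the target distribution, which is the intuition behind using MMD as the objective.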

artificial intelligence, curved space generated, machine learning, (2 more...)

doi: 10.24963/ijcai.2023/423

arXiv: 2302.00061

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
